Nielsen | Data Engineer - II Interview Experience | 2.5 YoE



Round 1 - DSA and SQL (Technical)

General Introduction:

🔹 Discussed my previous tech stack.

🔹 Questions around data volume I've worked with, nature of my work, and the impact it had.

Coding Problems:

🔹 Longest Substring with Unique Characters

Input: str = "aabcdeeefijklmno"

Expected Output: "efijklmno"

Approach: Sliding window technique with a hash set to track seen characters.
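A minimal Python sketch of the sliding-window approach (the function name is illustrative, not from the interview). The window grows to the right and shrinks from the left whenever a repeated character would enter it. On the sample input it returns "efijklmno", since the last "e" can start the window.

```python
def longest_unique_substring(s: str) -> str:
    """Return the first longest substring of s with no repeated characters."""
    seen = set()          # characters currently inside the window
    left = 0              # left edge of the window (inclusive)
    best_start, best_len = 0, 0
    for right, ch in enumerate(s):
        # Shrink from the left until ch is no longer in the window.
        while ch in seen:
            seen.remove(s[left])
            left += 1
        seen.add(ch)
        if right - left + 1 > best_len:
            best_start, best_len = left, right - left + 1
    return s[best_start:best_start + best_len]


print(longest_unique_substring("aabcdeeefijklmno"))
```

Each character enters and leaves the set at most once, so the run time is O(n).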

🔹 Check if Two Strings are Anagrams (O(n), no sorting allowed)

Input: s1 = "tan", s2 = "ant"

Expected Output: Yes

Approach: Use a hash map (dictionary) to count character frequencies.
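A minimal Python sketch of the frequency-count approach (function name is illustrative). It counts characters of the first string in a dictionary, then decrements while scanning the second, so no sorting is needed.

```python
def are_anagrams(s1: str, s2: str) -> bool:
    """Check whether s1 and s2 use exactly the same characters, in O(n)."""
    if len(s1) != len(s2):
        return False
    counts = {}
    for ch in s1:
        counts[ch] = counts.get(ch, 0) + 1
    for ch in s2:
        # A character that is missing or already used up breaks the match.
        if counts.get(ch, 0) == 0:
            return False
        counts[ch] -= 1
    return True


print("Yes" if are_anagrams("tan", "ant") else "No")
```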

SQL Question:

🔹 Problem: From a product pricing table, find product names where the prices are strictly increasing over months.

Schema:

Fields: product_name, product_id, price, price_change_month

Sample Data:

Expected Output: b

Approach: Use window functions (LAG, ROW_NUMBER) or self-joins to compare consecutive rows and check for strictly increasing price trends.
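A rough sketch of the LAG-based approach, run through Python's built-in sqlite3 module so it can be tried locally. The table name `product_prices`, the `YYYY-MM` month format, and all rows are assumptions made up for illustration, since the original sample data is not available. It also assumes a product needs at least two price points to count as increasing, and a SQLite build with window-function support (3.25 or newer).

```python
import sqlite3

# Hypothetical table name and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_prices ("
    "product_name TEXT, product_id INTEGER, price REAL, price_change_month TEXT)"
)
conn.executemany(
    "INSERT INTO product_prices VALUES (?, ?, ?, ?)",
    [
        ("a", 1, 100, "2024-01"), ("a", 1, 90, "2024-02"), ("a", 1, 120, "2024-03"),
        ("b", 2, 50, "2024-01"), ("b", 2, 60, "2024-02"), ("b", 2, 75, "2024-03"),
        ("c", 3, 30, "2024-01"), ("c", 3, 30, "2024-02"), ("c", 3, 40, "2024-03"),
    ],
)

QUERY = """
WITH with_prev AS (
    SELECT product_id, product_name, price,
           LAG(price) OVER (
               PARTITION BY product_id ORDER BY price_change_month
           ) AS prev_price
    FROM product_prices
)
SELECT product_name
FROM with_prev
GROUP BY product_id, product_name
HAVING COUNT(*) > 1
   -- zero rows where the price failed to rise over the previous month
   AND SUM(CASE WHEN prev_price IS NOT NULL AND price <= prev_price
                THEN 1 ELSE 0 END) = 0
ORDER BY product_name
"""

increasing = [row[0] for row in conn.execute(QUERY)]
print(increasing)
```

With these made-up rows only product b qualifies: a drops in the second month and c stays flat, which fails the strict comparison.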

Round 2 - Data Modeling and Spark

Project Discussion:

🔹 Detailed discussion on my previous project.

Spark-related questions focused on:

🔹 Data skewness handling

🔹 Code optimizations and partitioning strategies
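One common way to handle skew is key salting. The sketch below is plain Python rather than Spark code, and it is only an illustration of the idea, not a record of what was discussed in the round. It assumes a hash partitioner (simulated here with `zlib.crc32`) and a made-up dataset where one hot key dominates.

```python
import zlib
from collections import Counter


def partition_for(key: str, num_partitions: int) -> int:
    # Deterministic stand-in for a hash partitioner.
    return zlib.crc32(key.encode()) % num_partitions


def partition_sizes(keys, num_partitions, num_salts=1):
    """Count rows per partition, optionally appending a salt to each key."""
    sizes = Counter()
    for i, key in enumerate(keys):
        # Salting turns one hot key into num_salts distinct keys.
        salted = f"{key}_{i % num_salts}" if num_salts > 1 else key
        sizes[partition_for(salted, num_partitions)] += 1
    return sizes


# Made-up skewed dataset: one hot key plus many small keys.
keys = ["hot"] * 9000 + [f"k{i}" for i in range(1000)]

plain = partition_sizes(keys, num_partitions=8)
salted = partition_sizes(keys, num_partitions=8, num_salts=16)

print("largest partition without salt:", max(plain.values()))
print("largest partition with salt:   ", max(salted.values()))
```

The trade-off is that any aggregation on the salted key needs a second pass to merge the partial results back under the original key.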

Data Modeling Task:

🔹 Design a ride-booking app like Uber/Ola.

Created a Galaxy schema with fact and dimension tables to represent:

Users

Drivers

Rides

Payments

Locations

Ratings
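A rough sketch of what such a galaxy (fact constellation) schema could look like, written as SQLite DDL run from Python. Which entities are facts and which are dimensions, and every table and column name here, are assumptions for illustration. The point is that several fact tables share the same dimension tables.

```python
import sqlite3

# Hypothetical schema: three fact tables sharing user/driver/location dimensions.
DDL = """
CREATE TABLE dim_user     (user_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_driver   (driver_id INTEGER PRIMARY KEY, name TEXT, vehicle_type TEXT);
CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, city TEXT, lat REAL, lon REAL);

CREATE TABLE fact_ride (
    ride_id            INTEGER PRIMARY KEY,
    user_id            INTEGER REFERENCES dim_user(user_id),
    driver_id          INTEGER REFERENCES dim_driver(driver_id),
    pickup_location_id INTEGER REFERENCES dim_location(location_id),
    drop_location_id   INTEGER REFERENCES dim_location(location_id),
    distance_km        REAL
);

CREATE TABLE fact_payment (
    payment_id INTEGER PRIMARY KEY,
    ride_id    INTEGER REFERENCES fact_ride(ride_id),
    user_id    INTEGER REFERENCES dim_user(user_id),
    amount     REAL,
    method     TEXT
);

CREATE TABLE fact_rating (
    rating_id INTEGER PRIMARY KEY,
    ride_id   INTEGER REFERENCES fact_ride(ride_id),
    user_id   INTEGER REFERENCES dim_user(user_id),
    driver_id INTEGER REFERENCES dim_driver(driver_id),
    stars     INTEGER
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```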

OOP Code Design for Ride Booking System:

🔹 A simplified structure using classes and inheritance:

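class Users:
    def __init__(self, u_id, u_name, location, driver_available, fare_calc):
        self.u_id = u_id
        self.u_name = u_name
        self.location = location
        # Collaborators are passed in as instances, not used as base classes.
        self.driver_available = driver_available
        self.fare_calc = fare_calc

    def estimate_fare(self, weather, vehicle_type):
        return self.fare_calc.fare_func(self.location, weather, vehicle_type)

    def request_ride(self, location, vehicle_type):
        return self.driver_available.check_driver(location, vehicle_type)


class Driver:
    # Placeholder in-memory records; a real system would query a driver store.
    drivers = {"d1": {"location": (12.97, 77.59), "vehicle_type": "sedan"}}

    def driver_curr_location(self, driver_id):
        return self.drivers[driver_id]["location"]  # (lat, long)

    def driver_veh_type(self, driver_id):
        return self.drivers[driver_id]["vehicle_type"]


class DriverAvailable(Driver):
    def check_driver(self, location, vehicle_type):
        # Placeholder rule: ignores distance from location and only checks
        # that some driver has the requested vehicle type.
        return any(d["vehicle_type"] == vehicle_type for d in self.drivers.values())


class FareCalc:
    def fare_func(self, location, weather, vehicle_type):
        # Placeholder pricing, not a real fare model.
        base = 150.0 if vehicle_type == "suv" else 100.0
        return base * (1.5 if weather == "rain" else 1.0)


class Status:
    def __init__(self, driver_ok=False, user_ok=False):
        self.driver_ok = driver_ok
        self.user_ok = user_ok

    def driver_accept(self):
        return self.driver_ok

    def user_accept(self):
        return self.user_ok


# Driver comes in through DriverAvailable; a payment is not a user,
# so Users is not a base class here.
class Payments(FareCalc, Status, DriverAvailable):
    def payment(self, driver_id, location, weather, vehicle_type):
        if self.driver_accept() and self.user_accept():
            amount = self.fare_func(location, weather, vehicle_type)
            return amount, self.driver_curr_location(driver_id)
        return "ride_cancelled"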

Round 3 - Situational Design + Hiring Manager (With Director of Engineering)

Discussion Highlights:

🔹 Introduction and detailed discussion around my role at Nielsen.

🔹 Responsibilities, impact areas, and future expectations.

Situational Design Problem:

🔹 How can Amazon Prime detect that a user logging in from different accounts and locations (e.g., India and US) is the same person, to show consistent movie recommendations?

Solution Approach:

🔹 Use probabilistic matching based on a combination of soft and hard identifiers.

Attributes Considered:

First & Last Name

Date of Birth

City of Birth

Phone Number (if same → high confidence match)

Email Address (if same → high confidence match)

Time of Birth (accurate but not user-friendly)

Personal security questions (e.g., favourite animal, father's name)

Matching Flow:

Step 1: Match on phone or email → strong indicator (1:1 match)

Step 2: If not matched, apply layered filters on remaining attributes to narrow down potential matches.

Step 3: Prompt user confirmation via a non-sensitive UI message showing:

Last login device

Last login time

Location
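The matching flow above could be sketched roughly as follows in Python. The `Profile` fields, the weights, and the threshold of 80 are all hypothetical values chosen for illustration. A real system would tune them on data and likely use fuzzy comparison instead of exact string equality.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Profile:
    name: str
    dob: str
    birth_city: str
    phone: Optional[str] = None
    email: Optional[str] = None


# Hypothetical weights (out of 100) for the soft identifiers.
SOFT_WEIGHTS = {"name": 40, "dob": 40, "birth_city": 20}


def _same(x: Optional[str], y: Optional[str]) -> bool:
    # Both values must be present and equal after light normalization.
    return bool(x) and bool(y) and x.strip().lower() == y.strip().lower()


def match_decision(a: Profile, b: Profile, threshold: int = 80) -> str:
    # Step 1: hard identifiers give a direct match.
    if _same(a.phone, b.phone) or _same(a.email, b.email):
        return "match"
    # Step 2: score the remaining soft attributes.
    score = sum(
        weight
        for field, weight in SOFT_WEIGHTS.items()
        if _same(getattr(a, field), getattr(b, field))
    )
    # Step 3: a likely match still goes to the user for confirmation.
    return "confirm" if score >= threshold else "no_match"


# Made-up profiles for illustration.
india = Profile("Asha Rao", "1995-01-01", "Pune", phone="+911234567890")
us_same_phone = Profile("A. Rao", "1995-01-01", "Pune", phone="+911234567890")
us_soft_only = Profile("asha rao", "1995-01-01", "Mumbai")
stranger = Profile("John Doe", "1980-05-05", "Austin")

print(match_decision(india, us_same_phone))
print(match_decision(india, us_soft_only))
print(match_decision(india, stranger))
```

Keeping the "confirm" outcome separate from "match" mirrors Step 3: soft-attribute matches are only candidates until the user acknowledges the prompt.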


Final Conversation:

Discussed expectations from the role, growth plans, and received the final offer 🎉